Smooth Fictitious Play in Stochastic Games with Perturbed Payoffs and Unknown Transitions
Recent extensions to dynamic games of the well-known fictitious play learning procedure in static games were proved to converge globally to stationary Nash equilibria in two important classes of dynamic games (zero-sum and identical-interest discounted stochastic games). However, those decentralized algorithms require the players to know the model exactly (the transition probabilities and their payoffs at every stage). To relax these strong assumptions, our paper introduces regularizations of the recent algorithms that are, moreover, model-free (players do not know the transitions, and their payoffs are perturbed at every stage). Our novel procedures can be interpreted as extensions to stochastic games of the classical smooth fictitious play learning procedures in static games (where players' best responses are regularized through a smooth perturbation of their payoff functions). We prove the convergence of our family of procedures to stationary regularized Nash equilibria in the same classes of dynamic games (zero-sum and identical-interest discounted stochastic games). The proof uses the continuous-time smooth best-response dynamics counterparts and stochastic approximation methods.
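The smooth (logit) best response at the heart of such procedures is easy to illustrate in the static-game setting the abstract refers back to. The sketch below is our own toy implementation for a zero-sum matrix game, not the paper's stochastic-game algorithm; the function names, the matching-pennies instance, the temperature value, and the deterministic averaged update are all our choices:

```python
import numpy as np

def smooth_best_response(payoffs, temperature=0.1):
    """Logit (softmax) best response: a smooth perturbation of argmax."""
    z = payoffs / temperature
    z = z - z.max()                      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def smooth_fictitious_play(A, steps=5000, temperature=0.1, seed=0):
    """Deterministic (expected-update) smooth fictitious play on a
    zero-sum matrix game where the row player maximizes x^T A y."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = rng.dirichlet(np.ones(m))        # arbitrary initial mixed strategies
    y = rng.dirichlet(np.ones(n))
    for t in range(steps):
        br_x = smooth_best_response(A @ y, temperature)
        br_y = smooth_best_response(-A.T @ x, temperature)
        step = 1.0 / (t + 2)             # decreasing step = empirical averaging
        x = x + step * (br_x - x)
        y = y + step * (br_y - y)
    return x, y

# Matching pennies: the unique (regularized) equilibrium is (1/2, 1/2).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = smooth_fictitious_play(A)
```

For small temperatures the logit response approaches the exact best response, recovering classical fictitious play; the regularization is what makes the iterates well behaved under perturbed payoffs.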
On the Convergence of No-Regret Learning Dynamics in Time-Varying Games
Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multi-agent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized by natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also apply to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.
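The OGD update on a sequence of zero-sum matrix games can be sketched in a few lines. This is our illustrative instance, not the paper's analysis: the simplex projection, the step size, the slowly rescaled matching-pennies sequence, and the initial points are all our assumptions (rescaling matching pennies keeps its equilibrium fixed at the uniform strategy, so the time-averaged iterates should approach it):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def ogd_time_varying(games, eta=0.1, x0=None, y0=None):
    """Optimistic gradient descent/ascent over a sequence of zero-sum
    matrix games A_1..A_T (the row player maximizes x^T A_t y)."""
    m, n = games[0].shape
    x = np.ones(m) / m if x0 is None else np.asarray(x0, float)
    y = np.ones(n) / n if y0 is None else np.asarray(y0, float)
    gx_prev, gy_prev = np.zeros(m), np.zeros(n)
    avg_x, avg_y = np.zeros(m), np.zeros(n)
    for A in games:
        gx, gy = A @ y, -(A.T @ x)       # simultaneous gradient evaluation
        # optimistic correction: step along 2*g_t - g_{t-1}
        x = project_simplex(x + eta * (2.0 * gx - gx_prev))
        y = project_simplex(y + eta * (2.0 * gy - gy_prev))
        gx_prev, gy_prev = gx, gy
        avg_x, avg_y = avg_x + x, avg_y + y
    return avg_x / len(games), avg_y / len(games)

# Slowly rescaled matching pennies: equilibrium stays at (1/2, 1/2).
A0 = np.array([[1.0, -1.0], [-1.0, 1.0]])
games = [(1.0 + 0.2 * np.sin(t / 50.0)) * A0 for t in range(2000)]
x_bar, y_bar = ogd_time_varying(games, x0=[0.8, 0.2], y0=[0.3, 0.7])
```

The optimistic term `2*g_t - g_{t-1}` uses the previous gradient as a prediction of the next one, which is what damps the cycling that plain gradient descent/ascent exhibits in zero-sum games.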
No-Regret Learning in Bayesian Games
Jason Hartline, Vasilis Syrgkanis, Eva Tardos
Recent price-of-anarchy analyses of games of complete information suggest that coarse correlated equilibria, which characterize outcomes resulting from no-regret learning dynamics, have near-optimal welfare. This work provides two main technical results that lift this conclusion to games of incomplete information, a.k.a., Bayesian games. First, near-optimal welfare in Bayesian games follows directly from the smoothness-based proof of near-optimal welfare in the same game when the private information is public.
A Game-Theoretic Framework for Distributed Load Balancing: Static and Dynamic Game Models
Fardno, Fatemeh, Etesami, Seyed Rasoul
Motivated by applications in job scheduling, queuing networks, and load balancing in cyber-physical systems, we develop and analyze a game-theoretic framework to balance the load among servers in both static and dynamic settings. In these applications, jobs/tasks are often held by selfish entities that do not want to coordinate with each other, yet the goal is to balance the load among servers in a distributed manner. First, we provide a static game formulation in which each player holds a job with a certain processing requirement and wants to schedule it fractionally among a set of heterogeneous servers to minimize its average processing time. We show that this static game is a potential game and admits a pure Nash equilibrium (NE). In particular, the best-response dynamics converge to such an NE after $n$ iterations, where $n$ is the number of players. We then extend our results to a dynamic game setting, where jobs arrive and get processed in the system, and players observe the load (state) on the servers to decide how to schedule their jobs among the servers in order to minimize their average cumulative processing time. In this setting, we show that if the players update their strategies using dynamic best-response strategies, the system eventually becomes fully load-balanced and the players' strategies converge to the pure NE of the static game. In particular, we show that the convergence time scales only polynomially with respect to the game parameters. Finally, we provide numerical results to evaluate the performance of our proposed algorithms under both static and dynamic settings.
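A fractional best response of the kind described reduces to a small convex program per player, solvable by water-filling on the KKT multiplier. The sketch below is our own toy model, not the paper's: we assume each player $i$ splits a job of size $w_i$ to minimize $\sum_j x_{ij}(L_j + x_{ij})/s_j$ (others' load $L_j$, server speed $s_j$), which is one natural reading of "average processing time"; server speeds, job sizes, and the round count are made up:

```python
import numpy as np

def best_response(others_load, speeds, w):
    """Split a job of size w across servers to minimize
    sum_j x_j * (others_load_j + x_j) / speeds_j.
    KKT gives x_j = max(0, (lam * s_j - L_j) / 2); find lam by bisection."""
    L, s = others_load, speeds
    alloc = lambda lam: np.maximum(0.0, (lam * s - L) / 2.0)
    lo, hi = 0.0, (L.max() + 2.0 * w) / s.min()   # alloc(hi).sum() >= w
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if alloc(mid).sum() < w:
            lo = mid
        else:
            hi = mid
    x = alloc(hi)
    return x * (w / x.sum())                       # exact feasibility

def best_response_dynamics(n_players, speeds, jobs, rounds=50):
    """Round-robin best-response dynamics; returns final allocations."""
    speeds = np.asarray(speeds, float)
    X = np.zeros((n_players, len(speeds)))
    X[:, 0] = jobs                                 # start: everyone on server 0
    for _ in range(rounds):
        for i in range(n_players):
            others = X.sum(axis=0) - X[i]
            X[i] = best_response(others, speeds, jobs[i])
    return X

# Toy instance: 3 players with unit jobs, 2 identical servers.
jobs = np.array([1.0, 1.0, 1.0])
X = best_response_dynamics(3, [1.0, 1.0], jobs)
loads = X.sum(axis=0)
```

With this quadratic cost the game admits the exact potential $\Phi = \sum_j (L_j^2 + \sum_i x_{ij}^2)/(2 s_j)$, which is why the round-robin best responses settle: on identical servers the final loads come out balanced.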
Polynomial-time Approximation Scheme for Equilibriums of Games
Sun, Hongbo, Xia, Chongkun, Yuan, Bo, Wang, Xueqian, Liang, Bin
The Nash equilibrium [1] of normal-form games was proposed decades ago, yet even whether a polynomial-time approximation scheme (PTAS) exists for it remains open, let alone for equilibria of games with dynamics. A PTAS for equilibria of games is important in its own right in game theory, and confirming its existence may impact multi-agent reinforcement learning research. First, the existence of a PTAS bears on whether the computational power required to reach equilibria of large-scale games is practical. It has been proved that exactly computing a Nash equilibrium of a static game is PPAD-hard [2]. Setting aside the possibility that PPAD itself admits polynomial-time algorithms [3], a PTAS describes methods that compute approximate Nash equilibria efficiently. Second, confirming the previously unknown existence of a PTAS for games suggests it may be possible to simultaneously resolve the problems of non-stationarity in training and the curse of dimensionality [4] in multi-agent reinforcement learning. Both problems are related to the absence of a PTAS for equilibria of games: non-stationarity in training reflects the fact that existing polynomial-time methods lack convergence guarantees to equilibria, and the curse of dimensionality reflects the fact that methods with convergence guarantees lack polynomial-time complexity.
A Coupling Approach to Analyzing Games with Dynamic Environments
Collins, Brandon C., Xu, Shouhuai, Brown, Philip N.
The theory of learning in games has extensively studied situations where agents respond dynamically to each other by optimizing a fixed utility function. However, in real situations, the strategic environment varies as a result of past agent choices. Unfortunately, the analysis techniques that enabled a rich characterization of the emergent behavior in static environment games fail to cope with dynamic environment games. To address this, we develop a general framework using probabilistic couplings to extend the analysis of static environment games to dynamic ones. Using this approach, we obtain sufficient conditions under which traditional characterizations of Nash equilibria with best response dynamics and stochastic stability with log-linear learning can be extended to dynamic environment games. As case studies, we pose a model of cyber threat intelligence sharing between firms and a simple dynamic game-theoretic model of social precautions in an epidemic, both of which feature dynamic environments. For both examples, we obtain conditions under which the emergent behavior in the dynamic game is characterized by performing the traditional analysis on a reference static environment game.
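Log-linear learning itself is easy to state: at each step a uniformly chosen agent revises its action with probabilities proportional to exp(utility/τ), and in potential games the process concentrates on potential maximizers as τ shrinks. The seeded toy run below is our own static-environment example (a 2x2 identical-interest coordination game with arbitrary payoffs and temperature), not either of the paper's models:

```python
import numpy as np

def payoff(a):
    """Identical-interest coordination game: payoff 1 for matching on
    action 0, payoff 2 for matching on action 1, 0 for mismatching."""
    if a[0] == a[1]:
        return 2.0 if a[0] == 1 else 1.0
    return 0.0

def log_linear_learning(steps=20000, tau=0.3, seed=0):
    """Run log-linear learning; return the fraction of time spent at the
    potential maximizer (1, 1)."""
    rng = np.random.default_rng(seed)
    a = [0, 0]                           # start in the inferior equilibrium
    time_at_best = 0
    for _ in range(steps):
        i = rng.integers(2)              # pick a random revising player
        utils = []
        for action in (0, 1):
            trial = list(a)
            trial[i] = action
            utils.append(payoff(trial))
        z = np.exp(np.array(utils) / tau)
        a[i] = rng.choice(2, p=z / z.sum())
        time_at_best += (a == [1, 1])
    return time_at_best / steps

frac = log_linear_learning()
```

The stationary distribution is Gibbs over the potential, so even when started in the worse equilibrium the chain escapes via a mismatch state and then spends most of its time at the potential maximizer; the coupling framework above asks when this static-environment characterization survives an environment that moves with the agents' choices.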
Modern Game Theory and Multi-Agent Reinforcement Learning Systems
Most artificial intelligence (AI) systems nowadays are based on a single agent tackling a task or, in the case of adversarial models, a pair of agents that compete against each other to improve the overall behavior of a system. However, many cognition problems in the real world are the result of knowledge built by large groups of people. Take, for example, a self-driving-car scenario: the decisions of any agent are the result of the behavior of many other agents in the scenario. Many scenarios in financial markets or economics are also the result of coordinated actions among large groups of entities. How can we mimic that behavior in artificial intelligence (AI) agents?